Abstract

The theory of quantum error correction is a cornerstone of quantum 
information processing. It shows that quantum data can be protected 
against decoherence effects, which otherwise would render many of the 
new quantum applications practically impossible. In this paper we give
a self-contained introduction to this theory and to the closely related
concept of quantum channel capacities. We show, in particular, that it
is possible (using appropriate error correcting schemes) to send a
non-vanishing amount of quantum data undisturbed (in a certain asymptotic
sense) through a noisy quantum channel $T$, provided the errors produced
by $T$ are small enough.

1 Introduction



Controlling decoherence is one of the key problems for making quantum
information processing and quantum computation work. From the outset, when
Peter Shor announced his algorithm [?], many physicists felt that somewhere
there would be a price to pay for the miraculous exponential speedup. For
example, if the algorithm required exponentially good adherence to
specifications for the quantum circuitry and exponentially low noise levels,
it would be totally useless. Indeed it is far from easy to show that it does
not make such requirements.

In this article we look at the simpler, but equally fundamental problem 
of quantum information transmission or storage. Is it possible to encode the 
quantum data in such a way that even after some degradation they can be 
restored nearly perfectly by a suitable decoding operation? Assuming that the 
degrading decoherence effects are small to begin with, can restoration be made 
nearly perfect? 

For classical information it is very simple to do this, namely by redundant
coding. If we want to send one bit through a noisy channel, we can reduce
errors by sending it three times and deciding by majority vote which value we
take at the output. Clearly, if errors have a small probability $\varepsilon$
for a single channel, they will have order $\varepsilon^2$ for the triple
channel, because we go wrong only when two independent errors occur.
Unfortunately, such a scheme cannot work in the quantum case because it
involves a copying operation, which is forbidden by the No-Cloning Theorem [?].
So we have to look for subtler ways of distributing quantum information among
several systems and thereby reducing the probability of errors. Indeed such
schemes exist [3], [20] and are the subject of the exciting new field of
quantum error correcting codes.
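The classical majority-vote argument above can be checked directly. The following is a minimal sketch, assuming independent bit flips with probability `eps` on each of the three uses; the function name `majority_error` is chosen here for illustration only.

```python
from itertools import product

def majority_error(eps: float) -> float:
    """Probability that a majority vote over three independent uses is wrong."""
    total = 0.0
    # Enumerate all flip patterns (1 = bit flipped) on the three channel uses.
    for flips in product((0, 1), repeat=3):
        prob = 1.0
        for flipped in flips:
            prob *= eps if flipped else (1.0 - eps)
        if sum(flips) >= 2:  # two or more flips defeat the majority vote
            total += prob
    return total

for eps in (0.1, 0.01, 0.001):
    # The enumeration agrees with the closed form 3*eps^2 - 2*eps^3.
    print(eps, majority_error(eps), 3 * eps**2 - 2 * eps**3)
```

The leading term $3\varepsilon^2$ is the "order $\varepsilon^2$" behaviour described in the text.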

The efficiency of such a scheme is measured by two parameters, namely how
many uses of the noisy channel are required, and the error level after
correction. The above simple classical scheme can be iterated to get the
errors for a single bit down to $\varepsilon^{2^n}$ with $3^n$ parallel uses
of the channel. This is a large overhead to correct a single bit. Better
procedures work classically by coding several bits at a time, and one can
manage to make errors as small as desired with only a finite overhead per bit.
The minimal required overhead (or rather its inverse) is, in fact, the central
quantity of the coding theory [17] for noisy channels: one defines the
capacity of a channel as the number of bit transmissions per use of the
channel, in an optimal coding scheme for messages of length $L \to \infty$
with the property that the error probability goes to zero in this limit.
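To make the overhead of the iterated scheme concrete, here is a small sketch, assuming independent flips and exact majority voting at each level. The helper names are illustrative, and the bound $(3\varepsilon)^{2^k}/3$ mentioned in the comment is a simple consequence of $p \mapsto 3p^2 - 2p^3 \le 3p^2$ rather than a statement from the text.

```python
def concatenated_error(eps: float, levels: int) -> float:
    """Error after `levels` nested rounds of triple repetition with majority vote."""
    p = eps
    for _ in range(levels):
        p = 3 * p**2 - 2 * p**3  # exact failure probability of one majority vote
    return p

def channel_uses(levels: int) -> int:
    """Number of parallel channel uses needed for `levels` rounds."""
    return 3 ** levels

eps = 0.01
for k in range(4):
    # The error is bounded by (3*eps)**(2**k) / 3, i.e. doubly exponentially small in k,
    # while the number of channel uses grows like 3**k for a single protected bit.
    print(k, channel_uses(k), concatenated_error(eps, k), (3 * eps) ** (2**k) / 3)
```

The rate of this scheme is $3^{-k}$ bits per channel use, which tends to zero; this is the "large overhead" that better block codes avoid.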

It is not a priori clear that the notion of channel capacity makes sense for
quantum information, i.e. that the capacity of a channel which produces only
small errors is nonzero and close to that of the ideal (errorless) channel.
This is indeed not even evident from most existing presentations of the theory
of quantum error correcting codes. Papers which address this problem at least
for special cases like depolarizing channels are [?] and [15, Sec. 7.16.2],
while the general case is treated more recently in [?], [12]. The purpose of
this paper is not so much to present new results as to show in an elementary
and self-contained way that small quantum errors can be corrected with an
asymptotically small effort. To this end the paper is organized as follows. We
first review the basic notions concerning quantum channels (Section 2), and
give an abstract definition of the capacity together with some elementary
properties (Section 3). Then we discuss the theory of error correcting codes
(Section 4) and a particular scheme to construct such codes which is based on
graph theory (Section 5). In Sections 6 and 7 we apply this scheme to channel
capacities, and finally we draw our conclusions in the final section.



2 Quantum channels 

According to the rules of quantum mechanics, every kind of quantum system
is associated with a Hilbert space $\mathcal{H}$, which for the purpose of this
article we can take as finite dimensional. Since even elementary particles
require infinite dimensional Hilbert spaces, this means that we are usually
only trying to coherently manipulate a small part of the system. The simplest
quantum system has a two dimensional Hilbert space $\mathcal{H} = \mathbb{C}^2$,
and is called a qubit, for 'quantum bit'. The observables of the system are
given by bounded operators on $\mathcal{H}$; the space of these operators will
be denoted by $\mathcal{B}(\mathcal{H})$. The preparations (states) are given
by density operators $\rho \in \mathcal{B}_*(\mathcal{H})$, where the latter
denotes the space of trace class operators on $\mathcal{H}$. Of course, on
finite dimensional Hilbert spaces all linear operators are bounded and trace
class. So we use this notation mostly to keep track of the distinction between
spaces of observables and spaces of states.

A quantum channel, which transforms input systems described by a Hilbert
space $\mathcal{H}_1$ into output systems described by a (possibly different)
Hilbert space $\mathcal{H}_2$, is represented mathematically by a completely
positive, unital map $T : \mathcal{B}(\mathcal{H}_2) \to \mathcal{B}(\mathcal{H}_1)$.
Each $T$ can be written in the form [?]

$$T(A) = \sum_{i=1}^{n} F_i^* A F_i, \tag{1}$$

where the $F_i$ are (bounded) operators $\mathcal{H}_1 \to \mathcal{H}_2$,
called Kraus operators. The equivalence of this form to the condition of
complete positivity is a simple consequence of the Stinespring theorem [21].

The physical interpretation of $T$ is the following. The expectation value of
an $A$ measurement ($A \in \mathcal{B}(\mathcal{H}_2)$) at the output side of
the channel, on a system which is initially in the state
$\rho \in \mathcal{B}_*(\mathcal{H}_1)$, is given in terms of $T$ by
$\mathrm{tr}[\rho\, T(A)]$. Alternatively we can introduce the map
$T_* : \mathcal{B}_*(\mathcal{H}_1) \to \mathcal{B}_*(\mathcal{H}_2)$ which is
dual to $T$, i.e. $\mathrm{tr}[T_*(\rho) A] = \mathrm{tr}[\rho\, T(A)]$. It is
uniquely determined by $T$ (and vice versa) and we can say that $T_*$
represents the channel in the Schrödinger picture, while $T$ provides the
Heisenberg picture representation.
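The Kraus form and the duality between the two pictures can be illustrated numerically. The sketch below is not taken from the paper: it assumes a qubit amplitude damping channel with damping parameter `GAMMA` as a concrete example, and the state `RHO` and observable `OBS` are arbitrary illustrative choices.

```python
import math

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose of a matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def _sum_terms(terms, dim):
    out = [[0j] * dim for _ in range(dim)]
    for term in terms:
        out = [[out[i][j] + term[i][j] for j in range(dim)] for i in range(dim)]
    return out

def heisenberg(kraus, obs):
    """Heisenberg picture: T(A) = sum_i F_i^* A F_i."""
    return _sum_terms([matmul(dagger(f), matmul(obs, f)) for f in kraus], len(obs))

def schroedinger(kraus, rho):
    """Schroedinger picture: T_*(rho) = sum_i F_i rho F_i^*."""
    return _sum_terms([matmul(f, matmul(rho, dagger(f))) for f in kraus], len(rho))

GAMMA = 0.3  # assumed damping strength, for illustration only
KRAUS = [
    [[1, 0], [0, math.sqrt(1 - GAMMA)]],
    [[0, math.sqrt(GAMMA)], [0, 0]],
]
RHO = [[0.6, 0.2 + 0.1j], [0.2 - 0.1j, 0.4]]  # a density matrix
OBS = [[1, 2 - 1j], [2 + 1j, -0.5]]           # a Hermitian observable

# Unitality: T maps the identity to the identity.
print(heisenberg(KRAUS, [[1, 0], [0, 1]]))
# Duality: tr[T_*(rho) A] equals tr[rho T(A)].
print(trace(matmul(schroedinger(KRAUS, RHO), OBS)),
      trace(matmul(RHO, heisenberg(KRAUS, OBS))))
```

Unitality of $T$ in the Heisenberg picture corresponds to trace preservation of $T_*$ in the Schrödinger picture.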

Let us now consider the special case $\mathcal{H}_1 = \mathcal{H}_2 = \mathcal{H}$.
For example, $T$ may describe the transmission of photons through an optical
fiber or the storage in some sort of quantum memory. Ideally we would prefer
channels which do not affect the information at all, i.e. $T = \mathrm{Id}$,
the identity map on $\mathcal{B}(\mathcal{H})$. We will call this case the
ideal channel. In real situations, however, interaction with the environment,
i.e. additional, unobservable degrees of freedom, cannot be avoided. The
general structure of such a noisy channel is given by

$$\rho \mapsto T_*(\rho) = \mathrm{tr}_{\mathcal{K}}\bigl(U(\rho \otimes \rho_0)U^*\bigr), \tag{2}$$

where $U : \mathcal{H} \otimes \mathcal{K} \to \mathcal{H} \otimes \mathcal{K}$
is a unitary operator describing the common evolution of the system (Hilbert
space $\mathcal{H}$) and the environment (Hilbert space $\mathcal{K}$), and
$\rho_0 \in \mathcal{S}(\mathcal{K})$ is the initial state of the environment
(cf. Figure 1). Note that each $T$ can be represented in this way (this is
again an easy consequence of the Stinespring theorem); however, there are in
general many possible choices for such an "ancilla representation".

Figure 1: Noisy channel



3 Channel capacities 

As we have already pointed out in the introduction, the capacity of a quantum 
channel is, roughly speaking, the number of qubits transmitted per channel 
usage. In this section we will come to a more precise description. 

3.1 The cb-norm 

As a first step we need a measure for the difference between a noisy channel
$T : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ and its ideal
counterpart. There are several mathematical ways of expressing this, which
turn out to be equivalent for our purpose. We find it most convenient to take
a certain norm difference, i.e., to consider $\|T - \mathrm{Id}\|_{cb}$ as a
quantitative description of the noise level in $T$, where $\|\cdot\|_{cb}$
denotes a certain norm, called the norm of complete boundedness ("cb-norm" for
short). Its physical meaning is that of the largest difference between
probabilities measured in two experimental setups, differing only by the
substitution of $T$ by $\mathrm{Id}$. Since this setup may involve further
subsystems, and the measurement and preparation may be entangled with the
systems under consideration, we have to take into account such additional
systems in the definition of the norm. For a general linear operator
$T : \mathcal{B}(\mathcal{H}_2) \to \mathcal{B}(\mathcal{H}_1)$ we set

$$\|T\|_{cb} = \sup\bigl\{ \|(T \otimes \mathrm{Id}_n)(A)\| \;\bigm|\; n \in \mathbb{N};\ A \in \mathcal{B}(\mathcal{H}_2 \otimes \mathbb{C}^n);\ \|A\| \le 1 \bigr\}. \tag{3}$$

The cb-norm remedies the sometimes annoying property of the usual operator
norm that quantities like $\|T \otimes \mathrm{Id}_{\mathcal{B}(\mathbb{C}^d)}\|$
may increase with the dimension $d$. On infinite dimensional Hilbert spaces
$\|T\|_{cb}$ can be infinite although the supremum for every fixed $n$ is
finite. A particular example of a map with such behavior is the
transposition. A map with finite cb-norm is therefore called completely
bounded. In a finite dimensional setup each linear map is completely bounded.
For the transposition $\Theta$ on $\mathbb{C}^d$ we have in particular
$\|\Theta\|_{cb} = d$. The cb-norm has some nice features which we will use
frequently. This includes its multiplicativity
$\|T_1 \otimes T_2\|_{cb} = \|T_1\|_{cb}\, \|T_2\|_{cb}$ and the fact that
$\|T\|_{cb} = 1$ for every channel. For more properties of the cb-norm we
refer to [?].
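The growth of the transposition under tensoring with the identity can be seen in a small exact computation. The sketch below uses the flip (swap) operator on $\mathbb{C}^d \otimes \mathbb{C}^d$ as a test operator, which is an illustrative choice made here, not a construction from the text: the flip $F$ has norm 1, while its partial transpose $B = (\Theta \otimes \mathrm{Id}_d)(F)$ satisfies $B^2 = dB$ and $B = B^T$, so $B/d$ is a projection and $\|B\| = d$. This shows $\|\Theta\|_{cb} \ge d$.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def scale(a, c):
    return [[c * x for x in row] for row in a]

def flip(d):
    """Flip operator F|i,j> = |j,i> on C^d (x) C^d, basis index i*d + j."""
    f = [[0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for j in range(d):
            f[i * d + j][j * d + i] = 1
    return f

def partial_transpose_first(a, d):
    """Apply the transposition to the first tensor factor only."""
    b = [[0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for j in range(d):
            for k in range(d):
                for l in range(d):
                    b[i * d + j][k * d + l] = a[k * d + j][i * d + l]
    return b

for d in (2, 3):
    f = flip(d)
    b = partial_transpose_first(f, d)
    # F^2 = 1 (so ||F|| = 1), whereas B^2 = d B (so ||B|| = d).
    print(d, matmul(f, f) == identity(d * d), matmul(b, b) == scale(b, d))
```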

3.2 Achievable rates and capacity 

How can we reduce the error level $\|T - \mathrm{Id}\|_{cb}$? As an example,
consider a small unitary rotation, i.e., $T(X) = U^* X U$, with
$\|T - \mathrm{Id}\|_{cb} \le 2\|U - \mathbb{1}\|$ small. Then if we know $U$,
it is easy to correct $T$ by the inverse rotation, either before $T$, as an
"encoding", or afterwards, as a "decoding" operation. More generally, we may
use both, i.e., we try to make the combination $ETD \approx \mathrm{Id}$ by
careful choice of the channels $E$ and $D$. Note that in this way we may look
at channels $T$ which have different input and output spaces, and hence cannot
be compared directly with the ideal channel on any system. For such channels
there is no intrinsic way of defining "errors" as deviations from a desired
standard. Moreover, we are free to choose the Hilbert space $\mathcal{H}_0$
such that $ETD : \mathcal{B}(\mathcal{H}_0) \to \mathcal{B}(\mathcal{H}_0)$.
For the product $ETD$ to be defined, it is then necessary that
$D : \mathcal{B}(\mathcal{H}_0) \to \mathcal{B}(\mathcal{H}_2)$ and
$E : \mathcal{B}(\mathcal{H}_1) \to \mathcal{B}(\mathcal{H}_0)$. The best
error level we can achieve deserves its own notation. We define

$$\Delta(T, M) = \inf_{E, D} \|ETD - \mathrm{Id}\|_{cb}, \tag{4}$$

where the infimum is taken over all encodings $E$ and decodings $D$, and $M$
is the dimension of the space $\mathcal{H}_0$. Now for longer messages, e.g.,
a message of $m$ qubits (so that $M = 2^m$), we need to use the channel more
often. In the language of classical information theory, we are using longer
code words, say of length $n$. The error for coding $m$ qubits through $n$
uses of the channel $T$ is then $\Delta(T^{\otimes n}, 2^m)$. Can we make this
small while retaining a good rate $m/n$ of qubits per channel use? Clearly
there will be a trade-off between rate and errors, which is the basis of the
following Definition. The notation $\lfloor x \rfloor$, read "floor $x$",
denotes the largest integer $\le x$.

Definition 3.1 $c \ge 0$ is called an achievable rate for $T$ if

$$\lim_{n \to \infty} \Delta\bigl(T^{\otimes n}, \lfloor 2^{cn} \rfloor\bigr) = 0. \tag{5}$$

The supremum of all achievable rates is called the quantum capacity of $T$ and
is denoted by $Q(T)$.

Because $c = 0$ is always an achievable rate we have $Q(T) \ge 0$. On the
other hand, if every $c > 0$ is achievable we write $Q(T) = \infty$.

Often a coding scheme construction does not work for arbitrary integers, but 
only for specific values of n, or the dimension of the coding space. However, this 
is no serious restriction, as the following Lemma shows. 

Lemma 3.2 Let $(n_\alpha)_{\alpha \in \mathbb{N}}$ be a strictly increasing
sequence of integers such that $\lim_\alpha n_{\alpha+1}/n_\alpha = 1$. Suppose
$M_\alpha$ are integers such that
$\lim_\alpha \Delta(T^{\otimes n_\alpha}, M_\alpha) = 0$. Then any

$$c < \liminf_{\alpha} \frac{\log_2 M_\alpha}{n_\alpha} \tag{6}$$

is an achievable rate. Moreover, if the errors decrease exponentially, in the
sense that $\Delta(T^{\otimes n_\alpha}, M_\alpha) \le \mu e^{-\lambda n_\alpha}$
($\mu, \lambda \ge 0$), then they decrease exponentially for all $n$ with rate

$$\liminf_{n \to \infty} \frac{-1}{n} \log \Delta\bigl(T^{\otimes n}, \lfloor 2^{cn} \rfloor\bigr) \ge \lambda. \tag{7}$$

Proof. Let us introduce the notation
$c_+ = \liminf_\alpha (\log_2 M_\alpha)/n_\alpha$, so $c < c_+$. We pick
$\eta > 0$ such that $(1 + \eta)c < c_+$. Then for sufficiently large
$\alpha \ge \alpha_0$ we have $(n_{\alpha+1}/n_\alpha) \le (1 + \eta)$, and
$(\log_2 M_\alpha / n_\alpha) \ge (1 + \eta)c$. Now let $n \ge n_{\alpha_0}$,
and consider the unique index $\alpha$ such that
$n_\alpha \le n < n_{\alpha+1}$. Then $n \le (1 + \eta) n_\alpha$ and

$$\lfloor 2^{cn} \rfloor \le 2^{cn} \le 2^{c(1+\eta) n_\alpha} \le M_\alpha. \tag{8}$$

Clearly, $\Delta(T^{\otimes n}, M)$ decreases as $n$ increases, because good
coding becomes easier if we have more parallel channels, and increases with
$M$, because if a coding scheme works for an input Hilbert space
$\mathcal{H}_0$, it also works at least as well for states supported on a
lower dimensional subspace. Hence
$\Delta(T^{\otimes n}, \lfloor 2^{cn} \rfloor) \le \Delta(T^{\otimes n_\alpha}, M_\alpha) \to 0$.
It follows that $c$ is an achievable rate.

With the exponential bound on $\Delta$ we find similarly that

$$\Delta\bigl(T^{\otimes n}, \lfloor 2^{cn} \rfloor\bigr) \le \mu e^{-\lambda n_\alpha} \le \mu e^{-\lambda n/(1+\eta)}, \tag{9}$$

so that the liminf in (7) is $\ge \lambda/(1 + \eta)$. Since $\eta$ was
arbitrary, we get the desired result. □



3.3 Elementary properties 



To determine $Q(T)$ in terms of Definition 3.1 is fairly difficult, because
optimization problems in spaces of exponentially fast growing dimensions are
involved. In particular, this renders each direct numerical approach
practically impossible. In the classical situation, i.e. if we transfer
classical information through a classical channel $\Phi$, we can define a
capacity quantity $C(\Phi)$ in the same way as above. An explicit calculation
of $C(\Phi)$, however, can be reduced, according to Shannon's "noisy channel
coding theorem" [17], to an optimization problem over a low dimensional space,
which does not involve the limit of infinitely many parallel channels. A
similar coding theorem for the quantum case is not yet known; this is the
biggest open problem concerning channel capacities.

Nevertheless, there are some special cases in which the capacity can be
computed explicitly. The most relevant example is the ideal channel
$\mathrm{Id} = \mathrm{Id}_{\mathcal{B}(\mathbb{C}^d)}$. If $d^n \ge M$ we can
embed $\mathbb{C}^M$ into $(\mathbb{C}^d)^{\otimes n}$, hence
$\Delta(\mathrm{Id}^{\otimes n}, M) = 0$ and we see that the rate $\log_2(d)$
can be achieved. Intuitively we expect that this is the best that can be done,
because it is impossible to embed a high- into a low-dimensional space. This
intuition is in fact correct, i.e. we have $Q(\mathrm{Id}) = \log_2(d)$ for
the ideal channel. A precise proof of this statement is, however, not as easy
as it looks and we skip the details here. Perhaps the easiest approach is to
use the quantity $\log_2(\|\Theta T\|_{cb})$ (where $\Theta$ denotes the
transposition), which is an upper bound on $Q(T)$ (cf. [?] or [22]). The same
idea can be used to show that the quantum capacity of a classical channel, or
more generally a channel $T$ which uses classical information at an
intermediate step, is zero. This is a reformulation of the "no classical
teleportation theorem" (cf. again [22]).

Another useful relation concerns the concatenation of two general channels
$T_1$ and $T_2$: we transmit quantum information first through $T_1$ and then
through $T_2$. It is reasonable to assume that the capacity of the composition
$T_2 T_1$ cannot be bigger than the capacity of the channel with the smallest
bandwidth. This conjecture is indeed true and known as the "Bottleneck
inequality":

$$Q(T_2 T_1) \le \min\{Q(T_1), Q(T_2)\}. \tag{10}$$

Alternatively we can use the two channels in parallel, i.e. we consider the
tensor product $T_1 \otimes T_2$. In this case the capacity of the resulting
channel is at least as big as the sum of $Q(T_1)$ and $Q(T_2)$, i.e. $Q$ is
superadditive:

$$Q(T_1 \otimes T_2) \ge Q(T_1) + Q(T_2) \tag{11}$$

(cf. [?] for a proof of both statements). To decide whether $Q$ is even
additive, i.e. whether equality holds in (11), is another big open question
about channel capacities.



4 Quantum error correction 

The definition of capacity requires that we correct errors in a collection of
$n$ parallel channels $T^{\otimes n}$. Here the tensor product means that
successive uses of the channel are independent. For example, the physical
system used as a carrier is freshly prepared every time we use the channel.
This independence is important for error correcting schemes, because it
prevents errors happening on different channels from "conspiring".

Suggestive as it may be, quantum mechanics cautions us to be very careful 
with this sort of language: just as we cannot assign trajectories to quantum 
systems, it is problematic to speak about errors 'happening' in one channel, in a 
situation where we must expect different classical pictures to 'occur' in quantum 
mechanical superposition. This is to be kept in mind, when we now describe the 
theory of quantum error correcting codes in the sense of Knill and Laflamme
[10], which is very much based on a classification of errors according to the
place where they occur. For example, the coding/decoding pair $E, D$ will
typically have the property that
$E(T_1 \otimes T_2 \otimes \cdots \otimes T_n)D = \mathrm{Id}$ whenever the
number of positions at which $T_i \neq \mathrm{Id}$, i.e., the number of
errors, is small (cf. Figure 2).

In our presentation of the Knill-Laflamme Theory, we start from the error 
corrector's dream, namely the situation in which all the errors happen in an- 
other part of the system, where we do not keep any of the precious quantum 
information. This will help us to characterize the structure of the kind of errors 
which such a scheme may tolerate, or 'correct'. Of course, the dream is just a 
dream for the situation we are interested in: several parallel channels, each of 
which may be affected by errors. But the splitting of the system into subsystems, 
mathematically the decomposition of the Hilbert space of the total system into 
a tensor product is something we may change by a suitable unitary transforma- 
tion. This is then precisely the role of the encoding and decoding operations. 
The Knill-Laflamme theory is precisely the description of the situation where 
such a unitary, and hence a coding/decoding scheme exists. Constructing such 
schemes, however, is another matter, to which we will turn in the next section.

4.1 An error corrector's dream



So consider a system split into
$\mathcal{H} = \mathcal{H}_g \otimes \mathcal{H}_b$, where the indices $g$ and
$b$ stand for 'good' and 'bad'. We prepare the system in a state
$\rho \otimes |\Omega\rangle\langle\Omega|$, where $\rho$ is the quantum state
we want to protect. Now come the errors in the form of a completely positive
map $T(A) = \sum_i F_i^* A F_i$. Then according to the error corrector's
dream, we would just have to discard the bad system, and get the same state
$\rho$ as before.

The hardest demands for realizing this come from pure states
$\rho = |\phi\rangle\langle\phi|$, because the only way that the restriction
to the good system can again be $|\phi\rangle\langle\phi|$ is that the state
after errors factorizes, i.e.

$$T_*\bigl(|\phi \otimes \Omega\rangle\langle\phi \otimes \Omega|\bigr) = \sum_i |F_i(\phi \otimes \Omega)\rangle\langle F_i(\phi \otimes \Omega)| = |\phi\rangle\langle\phi| \otimes \sigma. \tag{12}$$

This requires that

$$F_i(\phi \otimes \Omega) = \phi \otimes \Phi_i, \tag{13}$$

where $\Phi_i \in \mathcal{H}_b$ is some vector, which must be independent of
$\phi$ if such an equation is to hold for all $\phi \in \mathcal{H}_g$.
Conversely, condition (13) implies (12) for every pure state
$|\phi\rangle\langle\phi|$ and, by convex combination, for every state $\rho$.

Two remarks are in order. Firstly, we have not required that
$F_i = \mathbb{1} \otimes F_i'$. This would be equivalent to demanding that
this scheme works with every $\Omega$, or indeed with every (possibly mixed)
initial state of the bad system. This would be much too strong for a useful
theory of codes. So later on we must insist on a proper initialization of the
bad subsystem by a suitable encoding. Secondly, if we have the condition (13)
for the Kraus operators of some channel $T$, then it also holds for all
channels whose Kraus operators can be written as linear combinations of the
$F_i$. In other words, the "set of correctable errors" is naturally identified
with the vector space of operators $F$ such that there is a vector
$\Phi \in \mathcal{H}_b$ with $F(\phi \otimes \Omega) = \phi \otimes \Phi$ for
all $\phi \in \mathcal{H}_g$. This space will be called the maximal error
space of the coding scheme, and will be denoted by $\mathcal{E}_{\max}$.
Usually, a code is designed for a given error space $\mathcal{E}$. Then the
statement that these given errors are corrected simply becomes
$\mathcal{E} \subset \mathcal{E}_{\max}$. The key observation, however, is
that the space of errors is a vector space in a natural way, i.e., if we can
correct two types of errors, then we can also correct their superposition.

Figure 2: Five bit quantum code: Encoding one qubit into five and correcting
one error.



4.2 Realizing the dream by unitary transformation 

Let us now consider the situation in which we want to send states of a small
system with Hilbert space $\mathcal{H}_1$ through a channel
$T : \mathcal{B}(\mathcal{H}_2) \to \mathcal{B}(\mathcal{H}_2)$. The Kraus
operators of $T$ lie in an error space
$\mathcal{E} \subset \mathcal{B}(\mathcal{H}_2)$, which we assume to be given.
No further assumptions will be made about $T$. Our task is now to devise
coding $E$ and decoding $D$ so that $ETD$ is the identity on
$\mathcal{B}(\mathcal{H}_1)$.

The idea is to realize the error corrector's dream by suitable encoding. The
'good' space in that scenario is, of course, the space $\mathcal{H}_1$. We are
looking for a way to write
$\mathcal{H}_2 \cong \mathcal{H}_1 \otimes \mathcal{H}_b$. Actually, an
isomorphism may be asking too much, and we look for an isometry
$U : \mathcal{H}_1 \otimes \mathcal{H}_b \to \mathcal{H}_2$. The encoding,
written best in the Schrödinger picture, is tensoring with an initial state
$\Omega$ as before, but now with an additional twist by $U$:

$$E_*(\rho) = U\bigl(\rho \otimes |\Omega\rangle\langle\Omega|\bigr)U^*. \tag{14}$$

The decoding operation $D$ is again taking the partial trace over the bad
space $\mathcal{H}_b$, after reversing $U$. Since $U$ is only an isometry and
not necessarily unitary, we need an additional term to make $D$ unit
preserving. The whole operation is best written in the Heisenberg picture:

$$D(X) = U(X \otimes \mathbb{1})U^* + \mathrm{tr}(\rho_0 X)(\mathbb{1} - UU^*), \tag{15}$$

where $\rho_0$ is an arbitrary density operator. These transformations are
successful if the error space (transformed by $U$) behaves as before, i.e., if
for all $F \in \mathcal{E}$ there are vectors $\Phi(F) \in \mathcal{H}_b$ such
that, for all $\phi \in \mathcal{H}_1$,

$$F U(\phi \otimes \Omega) = U(\phi \otimes \Phi(F)) \tag{16}$$

holds. This equation describes precisely the elements
$F \in \mathcal{E}_{\max}$ of the maximal error space.

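To check that we really have $ETD = \mathrm{Id}$ for any channel
$T(A) = \sum_i F_i^* A F_i$ with $F_i \in \mathcal{E}_{\max}$, it suffices to
consider pure input states $|\phi\rangle\langle\phi|$, and the measurement of
an arbitrary observable $X$ at the output:

$$\begin{aligned}
\mathrm{tr}\bigl[|\phi\rangle\langle\phi|\, ETD(X)\bigr]
 &= \sum_i \mathrm{tr}\bigl[U|\phi \otimes \Omega\rangle\langle\phi \otimes \Omega|U^* F_i^* U(X \otimes \mathbb{1})U^* F_i\bigr] \\
 &= \sum_i \mathrm{tr}\bigl[|\phi \otimes \Phi(F_i)\rangle\langle\phi \otimes \Phi(F_i)|\, X \otimes \mathbb{1}\bigr] \\
 &= \langle\phi, X\phi\rangle \sum_i \|\Phi(F_i)\|^2 = \langle\phi, X\phi\rangle.
\end{aligned} \tag{17}$$

In the last equation we have used that $\sum_i \|\Phi(F_i)\|^2 = 1$, since
$E$, $T$, and $D$ each map $\mathbb{1}$ to $\mathbb{1}$.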



4.3 The Knill-Laflamme condition 



The encoding $E$ defined in Equation (14) is of the form
$E_*(\rho) = V\rho V^*$ with the encoding isometry
$V : \mathcal{H}_1 \to \mathcal{H}_2$ given by

$$V\phi = U(\phi \otimes \Omega). \tag{18}$$

If we just know this isometry and the error space we can reconstruct the whole
structure, including the decomposition
$\mathcal{H}_2 \cong \mathcal{H}_1 \otimes \mathcal{H}_b \oplus (\mathbb{1} - UU^*)\mathcal{H}_2$,
and hence the decoding operation $D$. A necessary condition for this, first
established by Knill and Laflamme [10], is that, for arbitrary
$\phi_1, \phi_2 \in \mathcal{H}_1$ and error operators
$F_1, F_2 \in \mathcal{E}$:

$$\langle V\phi_1, F_1^* F_2 V\phi_2\rangle = \langle\phi_1, \phi_2\rangle\, \omega(F_1^* F_2) \tag{19}$$

holds with some numbers $\omega(F_1^* F_2)$ independent of $\phi_1, \phi_2$.
Indeed, from (16) we immediately get this equation with
$\omega(F_1^* F_2) = \langle\Phi(F_1), \Phi(F_2)\rangle$. Conversely, if the
Knill-Laflamme condition (19) holds, the numbers $\omega(F_1^* F_2)$ serve as
a (possibly degenerate) scalar product on $\mathcal{E}$, which upon completion
becomes the 'bad space' $\mathcal{H}_b$, such that $F \in \mathcal{E}$ is
identified with a Hilbert space vector $\Phi(F)$. The operator
$U : \phi \otimes \Phi(F) \mapsto FV\phi$ is then an isometry, as used at the
beginning of this section. To conclude, the Knill-Laflamme condition is
necessary and sufficient for the existence of a decoding operation. Its main
virtue is that we can use it without having to construct the decoding
explicitly.
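As a concrete check of the Knill-Laflamme condition, the sketch below tests it for the three-qubit bit-flip code, which is a standard textbook example and not a code discussed in this paper. The encoding isometry `V` maps the two basis states to $|000\rangle$ and $|111\rangle$; all matrices are real, so the adjoint is just the transpose. The helper names are illustrative.

```python
N_QUBITS = 3
DIM = 2 ** N_QUBITS

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [[a[j][i] for j in range(len(a))] for i in range(len(a[0]))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def pauli_x(q):
    """Bit flip on qubit q: permutes computational basis states."""
    m = [[0] * DIM for _ in range(DIM)]
    for s in range(DIM):
        m[s ^ (1 << q)][s] = 1
    return m

def pauli_z(q):
    """Phase flip on qubit q: diagonal with entries +1 / -1."""
    return [[(-1 if (s >> q) & 1 else 1) if s == t else 0 for t in range(DIM)]
            for s in range(DIM)]

# Encoding isometry V: |0> -> |000>, |1> -> |111> (an 8 x 2 matrix).
V = [[0, 0] for _ in range(DIM)]
V[0][0] = 1
V[DIM - 1][1] = 1

def kl_matrix(v, fa, fb):
    """The 2 x 2 matrix V^* Fa^* Fb V appearing in the condition."""
    return matmul(transpose(v), matmul(transpose(fa), matmul(fb, v)))

def is_scalar(m):
    return m[0][1] == 0 and m[1][0] == 0 and m[0][0] == m[1][1]

def satisfies_knill_laflamme(v, errors):
    """True if V^* Fa^* Fb V is a multiple of the identity for all pairs."""
    return all(is_scalar(kl_matrix(v, fa, fb)) for fa in errors for fb in errors)

BIT_FLIPS = [identity(DIM)] + [pauli_x(q) for q in range(N_QUBITS)]
print(satisfies_knill_laflamme(V, BIT_FLIPS))                    # single bit flips
print(satisfies_knill_laflamme(V, [identity(DIM), pauli_z(0)]))  # a phase flip
```

The check only inspects the form $\omega$; no decoding operation has to be constructed, which is the point made in the text.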

4.4 Example: Localized errors 

Let us come back to the problem we are addressing in this paper. In that case
the space $\mathcal{H}_2$ is the $n$-fold tensor product of the system
$\mathcal{H}$ on which the noisy channels under consideration act. We say that
a coding isometry $V : \mathcal{H}_1 \to \mathcal{H}^{\otimes n}$ corrects $f$
errors if it satisfies the Knill-Laflamme condition (19) for the error space
$\mathcal{E}_f$ spanned linearly by all operators of the kind
$X_1 \otimes X_2 \otimes \cdots \otimes X_n$, where at most $f$ places have a
tensor factor $X_i \neq \mathbb{1}$.

When $F_1$ and $F_2$ are both supported on at most $f$ sites, the product
$F_1^* F_2$, which appears in the Knill-Laflamme condition, involves at most
$2f$ sites. Therefore we can paraphrase the condition by saying that

$$\langle V\phi_1, X V\phi_2\rangle = \langle\phi_1, \phi_2\rangle\, \omega(X) \tag{20}$$

for $X \in \mathcal{E}_{2f}$. From Kraus operators in $\mathcal{E}_f$ we can
build arbitrary channels of the kind
$T = T_1 \otimes T_2 \otimes \cdots \otimes T_n$, where at most $f$ of the
tensor factors $T_i$ are channels different from $\mathrm{Id}$. We will use
this in the form that
$E(R_1 \otimes R_2 \otimes \cdots \otimes R_n)D = 0$, whenever at most $f$
tensor factors are $R_i \neq \mathrm{Id}$, and at least one of them is a
difference of two channels.

There are several ways to construct error correcting codes of this type (see
e.g. [?]). Most appropriate for our purposes is the scheme proposed in [16],
which is quite easy to describe and admits a simple way to check the error
correction condition. This will be the subject of the next section.

5 Graph Codes 

The general scheme of graph codes works not just for qubits, but for any
dimension $d$ of one site spaces. The code will have some number $m$ of input
systems, which we label by a set $X$, and, similarly, $n$ output systems,
labeled by a set $Y$. The Hilbert space of the system with label
$x \in X \cup Y$ will be denoted by $\mathcal{H}_x$, although all these are
isomorphic to $\mathbb{C}^d$, and are equipped with a special basis
$|j_x\rangle$, where $j_x \in \mathbb{Z}_d$ is an integer taken modulo $d$. As
a convenient shorthand, we write $j_X$ for a tuple of $j_x \in \mathbb{Z}_d$,
specified for every $x \in X$. Thus the $|j_X\rangle$ form a basis of the
input space $\mathcal{H}_X = \bigotimes_{x \in X} \mathcal{H}_x$ of the code.
An operator $F$, say, on the output space will be called localized on a subset
$Z \subset Y$ of systems, if it is some operator on
$\bigotimes_{y \in Z} \mathcal{H}_y$, tensored with the identity operators of
the remaining sites.

The main ingredient of the code construction is now an undirected graph with
vertices $X \cup Y$. The links of the graph are given by the adjacency matrix,
which we will denote by $\Gamma$. When we have $|X| = m$ input vertices and
$|Y| = n$ output vertices, this is an $(n + m) \times (n + m)$ matrix with
$\Gamma_{xy} = 1$ if nodes $x$ and $y$ are linked and $\Gamma_{xy} = 0$
otherwise. We do allow multiple edges, so the entries of $\Gamma$ will in
general be integers, which can also be taken modulo $d$. It is convenient to
exclude self-linked vertices, so we always take $\Gamma_{xx} = 0$.

Figure 3: Two graph codes.

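The graph determines an operator
$V = V_\Gamma : \mathcal{H}_X \to \mathcal{H}_Y$ by the formula

$$\langle j_Y | V_\Gamma | j_X \rangle = d^{-n/2} \exp\Bigl(\frac{i\pi}{d}\, j_{X \cup Y} \cdot \Gamma \cdot j_{X \cup Y}\Bigr), \tag{21}$$

where the exponent contains the matrix element of $\Gamma$

$$j_{X \cup Y} \cdot \Gamma \cdot j_{X \cup Y} = \sum_{x, y \in X \cup Y} j_x \Gamma_{xy} j_y. \tag{22}$$

Because $\Gamma$ is symmetric, every term in this sum appears twice, hence
adding a multiple of $d$ to any $j_x$ or $\Gamma_{xy}$ will change the
exponent in (21) by a multiple of $2\pi i$, and thus will not change
$V_\Gamma$.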

The error correcting properties of $V_\Gamma$ are summarized in the following
result [16]. It is just the Knill-Laflamme condition with a special expression
for the form $\omega$, for error operators such that $F_1^* F_2$ is localized
on a set $Z$.

Proposition 5.1 Let $\Gamma$ be a graph, i.e., a symmetric matrix with entries
$\Gamma_{xy} \in \mathbb{Z}_d$, for $x, y \in (X \cup Y)$. Consider a subset
$Z \subset Y$, and suppose that the $(Y \setminus Z) \times (X \cup Z)$-submatrix
of $\Gamma$ is non-singular, i.e.,

$$\forall y \in Y \setminus Z:\ \sum_{x \in X \cup Z} \Gamma_{yx} h_x \equiv 0 \quad \text{implies} \quad \forall x \in X \cup Z:\ h_x \equiv 0, \tag{23}$$

where congruences are mod $d$. Then, for every operator
$F \in \mathcal{B}(\mathcal{H}_Y)$ localized on $Z$, we have

$$V_\Gamma^* F V_\Gamma = d^{-n}\, \mathrm{tr}(F)\, \mathbb{1}_X. \tag{24}$$

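Proof. It will be helpful to use the notation for collections of variables,
already present in (22), more systematically: for any subset
$W \subset X \cup Y$ we write $j_W$ for the collection of variables $j_y$ with
$y \in W$. The Kronecker delta $\delta(j_W)$ is defined to be zero if
$j_y \neq 0$ for any $y \in W$, and one otherwise. By
$j_W \cdot \Gamma_{WW'} \cdot k_{W'}$ we mean the suitably restricted sum,
i.e., $\sum_{x \in W,\, y \in W'} j_x \Gamma_{xy} k_y$. The important sets to
which we apply this notation are $X' = (X \cup Z)$ and $Y' = Y \setminus Z$.
In particular, the condition on $\Gamma$ can be written as
$\Gamma_{Y'X'}\, j_{X'} = 0 \Rightarrow j_{X'} = 0$.

Consider now the matrix element

$$\begin{aligned}
\langle j_X | V_\Gamma^* F V_\Gamma | k_X \rangle
 &= \sum_{j_Y, k_Y} \langle j_X | V_\Gamma^* | j_Y \rangle \langle j_Y | F | k_Y \rangle \langle k_Y | V_\Gamma | k_X \rangle \\
 &= d^{-n} \sum_{j_Y, k_Y} e^{\frac{i\pi}{d}\left(k_{X \cup Y} \cdot \Gamma \cdot k_{X \cup Y} - j_{X \cup Y} \cdot \Gamma \cdot j_{X \cup Y}\right)} \langle j_Y | F | k_Y \rangle.
\end{aligned} \tag{25}$$

Since $F$ is localized on $Z$, the matrix element contains a factor
$\delta_{j_y, k_y}$ for every $y \in Y \setminus Z = Y'$, so we can write
$\langle j_Y | F | k_Y \rangle = \langle j_Z | F | k_Z \rangle\, \delta(j_{Y'} - k_{Y'})$.
Therefore we can compute the sum (25) in stages:

$$\langle j_X | V_\Gamma^* F V_\Gamma | k_X \rangle = \sum_{j_Z, k_Z} \langle j_Z | F | k_Z \rangle\, S(j_{X'}, k_{X'}), \tag{26}$$

where $S(j_{X'}, k_{X'})$ is the sum over the $Y'$-variables, which, of
course, still depends on the input variables $j_X, k_X$ and the variables
$j_Z, k_Z$ at the error positions:

$$S(j_{X'}, k_{X'}) = d^{-n} \sum_{j_{Y'}, k_{Y'}} \delta(j_{Y'} - k_{Y'})\, e^{\frac{i\pi}{d}\left(k_{X \cup Y} \cdot \Gamma \cdot k_{X \cup Y} - j_{X \cup Y} \cdot \Gamma \cdot j_{X \cup Y}\right)}. \tag{27}$$

The sums in the exponent can each be split into four parts according to the
decomposition $X'$ vs. $Y'$. The terms involving $\Gamma_{Y'Y'}$ cancel
because $k_{Y'} = j_{Y'}$. The terms involving $\Gamma_{X'Y'}$ and
$\Gamma_{Y'X'}$ are equal because $\Gamma$ is symmetric, and together give
$2 j_{Y'} \cdot \Gamma_{Y'X'} \cdot (k_{X'} - j_{X'})$. The $\Gamma_{X'X'}$
terms remain unchanged, but only give a phase factor independent of the
summation variables. Hence

$$\begin{aligned}
S(j_{X'}, k_{X'})
 &= d^{-n} e^{\frac{i\pi}{d}\left(k_{X'} \cdot \Gamma \cdot k_{X'} - j_{X'} \cdot \Gamma \cdot j_{X'}\right)} \sum_{j_{Y'}} e^{\frac{2\pi i}{d}\, j_{Y'} \cdot \Gamma_{Y'X'} \cdot (k_{X'} - j_{X'})} \\
 &= d^{-n} e^{\frac{i\pi}{d}\left(k_{X'} \cdot \Gamma \cdot k_{X'} - j_{X'} \cdot \Gamma \cdot j_{X'}\right)} d^{|Y'|}\, \delta\bigl(\Gamma_{Y'X'} \cdot (k_{X'} - j_{X'})\bigr) \\
 &= d^{-n + |Y'|} e^{\frac{i\pi}{d}\left(k_{X'} \cdot \Gamma \cdot k_{X'} - j_{X'} \cdot \Gamma \cdot j_{X'}\right)} \delta(k_{X'} - j_{X'}) \\
 &= d^{-n + |Y'|}\, \delta(k_{X'} - j_{X'}).
\end{aligned} \tag{28}$$

Here we used at the first equation that the sum is a product of geometric
series as they appear in discrete Fourier transforms. At the second equality
the main condition of the Proposition enters: if
$\sum_{x \in X'} \Gamma_{yx} (k_x - j_x)$ vanishes for all $y \in Y'$ as
required by the delta-function then (and only then) the vector
$k_{X'} - j_{X'}$ must vanish. But then the two terms in the exponent of the
phase factor also cancel. Inserting this result into (26), and using that
$\delta(h_{X'}) = \delta(h_X)\delta(h_Z)$, we find

$$\begin{aligned}
\langle j_X | V_\Gamma^* F V_\Gamma | k_X \rangle
 &= \delta(j_X - k_X)\, d^{-n + |Y'|} \sum_{j_Z} \langle j_Z | F | j_Z \rangle \\
 &= \delta(j_X - k_X)\, d^{-n} \sum_{j_Y} \langle j_Y | F | j_Y \rangle.
\end{aligned}$$

Here the error operator is considered in the first line as an operator on
$\mathcal{H}_Z$, and as an operator on $\mathcal{H}_Y$ in the second line, by
tensoring it with $\mathbb{1}_{Y'}$. This cancels the dimension factor
$d^{|Y'|}$. □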
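Formula (24) can be verified numerically for a small graph. The sketch below is an illustration under stated assumptions: it takes $d = 2$ and a "wheel" graph with one input vertex linked to five output vertices, which are in turn linked in a ring. This graph is chosen here for the example and is not claimed to be one of the graphs in Figure 3. The operator names `SHIFT` and `PROJ` are likewise illustrative.

```python
import cmath
from itertools import product

D = 2                       # local dimension (qubits)
INPUTS = [0]                # vertex set X
OUTPUTS = [1, 2, 3, 4, 5]   # vertex set Y
# Assumed example graph: input linked to every output, outputs linked in a ring.
EDGES = [(0, y) for y in OUTPUTS] + [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]

def adjacency(edges, size):
    gamma = [[0] * size for _ in range(size)]
    for x, y in edges:
        gamma[x][y] += 1
        gamma[y][x] += 1
    return gamma

def graph_isometry(gamma, d, inputs, outputs):
    """Matrix elements <j_Y|V|j_X> = d^(-n/2) exp(i*pi/d * j.Gamma.j)."""
    n = len(outputs)
    out_basis = list(product(range(d), repeat=n))
    in_basis = list(product(range(d), repeat=len(inputs)))
    vertices = list(inputs) + list(outputs)
    v = []
    for jy in out_basis:
        row = []
        for jx in in_basis:
            j = dict(zip(vertices, jx + jy))
            quad = sum(j[x] * gamma[x][y] * j[y] for x in vertices for y in vertices)
            row.append(d ** (-n / 2) * cmath.exp(1j * cmath.pi * quad / d))
        v.append(row)
    return v, out_basis

def local_operator(op, site, outputs, out_basis):
    """Operator `op` on one output site, tensored with identities elsewhere."""
    pos = outputs.index(site)
    return [[op[r[pos]][s[pos]]
             if all(r[i] == s[i] for i in range(len(r)) if i != pos) else 0
             for s in out_basis] for r in out_basis]

def sandwich(v, f):
    """Return the small matrix V^* F V."""
    rows, cols = len(v), len(v[0])
    fv = [[sum(f[r][s] * v[s][b] for s in range(rows)) for b in range(cols)]
          for r in range(rows)]
    return [[sum(v[r][a].conjugate() * fv[r][b] for r in range(rows))
             for b in range(cols)] for a in range(cols)]

def matches_proposition(v, f, d, n, tol=1e-9):
    """Check V^* F V = d^(-n) tr(F) 1 up to numerical tolerance."""
    expected = d ** (-n) * sum(f[r][r] for r in range(len(f)))
    m = sandwich(v, f)
    return all(abs(m[a][b] - (expected if a == b else 0)) < tol
               for a in range(len(m)) for b in range(len(m)))

GAMMA = adjacency(EDGES, 6)
V, BASIS = graph_isometry(GAMMA, D, INPUTS, OUTPUTS)
SHIFT = [[0, 1], [1, 0]]   # trace 0
PROJ = [[1, 0], [0, 0]]    # trace 1 on one site
for op in ([[1, 0], [0, 1]], SHIFT, PROJ):
    f = local_operator(op, 1, OUTPUTS, BASIS)
    print(matches_proposition(V, f, D, len(OUTPUTS)))
```

Taking the identity for `op` checks in particular that $V_\Gamma$ is an isometry for this graph.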
All that is left to get an error correcting code is to ensure that the
conditions of this Proposition are satisfied sufficiently often. Combining the
above Proposition with the example in Section 4.4 gives the following result.

Corollary 5.2 Let $\Gamma$ be a graph as in the previous Proposition, and
suppose that the $(Y \setminus Z) \times (X \cup Z)$-submatrix of $\Gamma$ is
non-singular for all $Z \subset Y$ with up to $2f$ elements. Then the code
associated to $\Gamma$ corrects $f$ errors.

Two particular examples (which are equivalent!) are given in Figure 3. In
both cases we have $N = 1$, $M = 5$ and $K = 1$, i.e. one input node, which
can be chosen arbitrarily, five output nodes, and the corresponding codes
correct one error.



6 Discrete to continuous error model 

The discrete error correction scheme described in the last section is not really 
designed to correct small errors: it corrects rare errors in multiple applications of 
the channel. A typical example of a small (but not rare) error is a small unitary 
rotation, $T(X) = U^* X U$. Then $\|T - \mathrm{Id}\|_{cb}$ can be small, but
since the same small error happens to each of the parallel channels in
$T^{\otimes n}$, the error syndromes of discrete error correction at first sight
do not seem to be appropriate at all. Nevertheless, the discrete theory can be
applied, and this is the content of the following Proposition. It is the
appropriate formulation of "reducing the order of errors from $\varepsilon$ to
$\varepsilon^{f+1}$".

Proposition 6.1 Let $T : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ be
a channel, and let $E$, $D$ be encoding and decoding channels for coding $m$
systems into $n$ systems. Suppose that this coding scheme corrects $f$ errors,
and that

\[ \|T - \mathrm{Id}\|_{cb} \le \frac{f+1}{n-f-1}. \tag{29} \]

Then

\[ \|E T^{\otimes n} D - \mathrm{Id}\|_{cb} \le \|T - \mathrm{Id}\|_{cb}^{f+1}\; 2^{\,n H_2((f+1)/n)}, \tag{30} \]

where $H_2(r) = -r \log_2 r - (1-r)\log_2(1-r)$ denotes the Shannon entropy of
the probability distribution $(r, 1-r)$.

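Proof. Into $E T^{\otimes n} D$ we insert the decomposition
$T = \mathrm{Id} + (T - \mathrm{Id})$ and expand the product. This gives $2^n$
terms, containing tensor products with some number, say $k$, of tensor factors
$(T - \mathrm{Id})$ and tensor factors $\mathrm{Id}$ on the remaining $(n-k)$
sites. The term with $k = 0$ is $E\,\mathrm{Id}^{\otimes n} D = \mathrm{Id}$ and
cancels against the subtracted identity. When $1 \le k \le f$, the error
correction property makes the term zero. Terms with $k > f$ we estimate by
$\|T - \mathrm{Id}\|_{cb}^{k}$. Collecting terms we get

\[ \|E T^{\otimes n} D - \mathrm{Id}\|_{cb} \le \sum_{k=f+1}^{n} \binom{n}{k}\, \|T - \mathrm{Id}\|_{cb}^{k}. \tag{31} \]

The rest then follows from the next Lemma (with $r = (f+1)/n$ and
$a = \|T - \mathrm{Id}\|_{cb}$). It treats the exponential growth in $n$ for
truncated binomial sums.

Lemma 6.2 Let $0 < r < 1$ and $a > 0$ such that $a \le r/(1-r)$. Then, for all
integers $n$:

\[ \frac{1}{n} \log_2 \sum_{k=rn}^{n} \binom{n}{k} a^k \le r \log_2 a + H_2(r). \tag{32} \]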

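Proof. For $\lambda \ge 0$ we can estimate the step function by an exponential,
and get

\[ \sum_{k=rn}^{n} \binom{n}{k} a^k \le \sum_{k=0}^{n} \binom{n}{k} a^k e^{\lambda(k - rn)} = e^{-\lambda r n}\left(1 + a e^{\lambda}\right)^{n} = M(\lambda)^n \tag{33} \]

with $M(\lambda) = e^{-\lambda r}(1 + a e^{\lambda})$. The minimum over all real
$\lambda$ is attained at $a e^{\lambda_{\min}} = r/(1-r)$. We get
$\lambda_{\min} \ge 0$ precisely when the conditions of the Lemma are satisfied,
in which case the bound is computed by evaluating
$M(\lambda_{\min}) = a^r\, r^{-r} (1-r)^{-(1-r)} = a^r\, 2^{H_2(r)}$. □ □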
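
The estimate of the Lemma can be checked numerically in small cases. The
following Python sketch is an illustration only and not part of the original
argument; the function names are hypothetical, and $rn$ is assumed to be an
integer, called `k_min` below. It compares the truncated binomial sum with the
bound $a^{rn}\, 2^{n H_2(r)}$ obtained by evaluating $M(\lambda)^n$ at the
minimizing $\lambda$.

```python
from math import comb, log2


def binary_entropy(r: float) -> float:
    """Shannon entropy H_2(r) of the distribution (r, 1 - r), in bits."""
    if r <= 0.0 or r >= 1.0:
        return 0.0
    return -r * log2(r) - (1 - r) * log2(1 - r)


def lemma_applies(n: int, k_min: int, a: float) -> bool:
    """Hypotheses with r = k_min / n: 0 < r < 1 and 0 < a <= r / (1 - r)."""
    return 0 < k_min < n and 0 < a <= k_min / (n - k_min)


def truncated_binomial_sum(n: int, k_min: int, a: float) -> float:
    """Sum of C(n, k) * a**k over k = k_min, ..., n."""
    return sum(comb(n, k) * a**k for k in range(k_min, n + 1))


def entropy_bound(n: int, k_min: int, a: float) -> float:
    """The bound a**k_min * 2**(n * H_2(k_min / n))."""
    return a**k_min * 2 ** (n * binary_entropy(k_min / n))


if __name__ == "__main__":
    # Illustrative values: n = 5 systems, f = 1 corrected error, so k_min = f + 1.
    n, f, a = 5, 1, 0.1
    print(lemma_applies(n, f + 1, a))
    print(truncated_binomial_sum(n, f + 1, a), entropy_bound(n, f + 1, a))
```

With `k_min = f + 1` and `a` standing for $\|T - \mathrm{Id}\|_{cb}$, the two
printed numbers are the right hand sides of (31) and (30), respectively.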



Suppose now that we find a family of coding schemes with $n, m \to \infty$ with
fixed rate $r \approx m/n$ of inputs per output, and a certain fraction
$f/n \approx \varepsilon$ of errors being corrected. Then we can apply the
Proposition and find that the errors can be estimated above by

\[ \Delta(T^{\otimes n}, d^m) \le \left( 2^{H_2(\varepsilon)}\, \|T - \mathrm{Id}\|_{cb}^{\varepsilon} \right)^{n}, \tag{34} \]

where $d$ is the Hilbert space dimension of each input system. This goes to zero,
and even exponentially to zero, as soon as the expression in parentheses is $< 1$.
This will be the case whenever $\|T - \mathrm{Id}\|_{cb}$ is small enough, or,
more precisely,

\[ \|T - \mathrm{Id}\|_{cb} < 2^{-H_2(\varepsilon)/\varepsilon}. \tag{35} \]

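Note in addition that we have for all $n \in \mathbb{N}$

\[ 2^{-H_2(\varepsilon)/\varepsilon} \le \frac{\varepsilon}{1 - \varepsilon}. \tag{36} \]

Since $\varepsilon/(1-\varepsilon) \le (f+1)/(n-f-1)$ as soon as
$f + 1 \ge \varepsilon n$, the bound from Equation (29) is implied by (35).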

The function appearing on the right hand side of (35) looks rather complicated,
so we will often replace it by a simpler one, namely

\[ \frac{\varepsilon}{e} < 2^{-H_2(\varepsilon)/\varepsilon}, \tag{37} \]

where $e$ is the base of natural logarithms; cf. Figure 4. The proof of this
inequality is left to the reader as an exercise in logarithms. The bound is very
good (exact to first order) in the range of small $\varepsilon$, in which we are
most interested anyhow. In any case, from
$\|T - \mathrm{Id}\|_{cb} \le \varepsilon/e$ we can draw the same conclusion as
from (35): exponentially decreasing errors, provided we can actually find code
families correcting a fraction $\varepsilon$ of errors. This will be the aim of
the next section.
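
To get a feeling for the two thresholds, they can be evaluated numerically. The
Python sketch below is an added illustration with hypothetical helper names. It
computes $2^{-H_2(\varepsilon)/\varepsilon}$, the simpler quantity
$\varepsilon/e$, and the base
$2^{H_2(\varepsilon)} \|T - \mathrm{Id}\|_{cb}^{\varepsilon}$ of the exponential
bound on the coding error, where `cb_distance` stands for
$\|T - \mathrm{Id}\|_{cb}$.

```python
from math import e, log2


def binary_entropy(x: float) -> float:
    """Shannon entropy H_2(x) of the distribution (x, 1 - x), in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)


def exact_threshold(eps: float) -> float:
    """2**(-H_2(eps) / eps), defined for 0 < eps < 1."""
    return 2 ** (-binary_entropy(eps) / eps)


def simple_threshold(eps: float) -> float:
    """The simpler quantity eps / e."""
    return eps / e


def error_base(eps: float, cb_distance: float) -> float:
    """2**H_2(eps) * cb_distance**eps; errors decay exponentially if this is < 1."""
    return 2 ** binary_entropy(eps) * cb_distance**eps


if __name__ == "__main__":
    for eps in (0.001, 0.01, 0.05, 0.1):
        print(eps, simple_threshold(eps), exact_threshold(eps),
              error_base(eps, simple_threshold(eps)))
```

For small $\varepsilon$ the two thresholds printed in each row nearly coincide,
which is the sense in which the simpler bound is exact to first order.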
Figure 4: The two bounds from Equation (37) plotted as a function of $\varepsilon$.



7 Coding by random graphs 

Our aim in this section is to apply the theory of graph codes to construct a family 
of codes with positive rate. It is not so easy to construct such families explicitly. 
However, if we are only interested in existence, and do not attempt to get the 
best possible rates, we can use a simple argument, which shows not only the 
existence of codes correcting a certain fraction of errors, but even that "typical 
graph codes" for sufficiently large numbers of inputs and outputs have this 
property. Here "typical" is in the sense of the probability distribution, defined 
by simply setting the edges of the graph independently, and each according 
to the uniform distribution of the possible values of the adjacency matrix. For 
the random method to work we need the dimension of the underlying one site 
Hilbert space to be a prime number. This curious condition is most likely an 
artefact of our method, and will be removed later on. 

We have seen that a graph code corrects many errors if certain submatrices 
of the adjacency matrix have maximal rank. Therefore we need the following 
Lemma. 

Lemma 7.1 Let $d$ be a prime, $M < N$ integers and let $X$ be an
$N \times M$-matrix with independent and uniformly distributed entries in
$\mathbb{Z}_d$. Then $X$ is singular over the field $\mathbb{Z}_d$ with
probability at most $d^{-(N-M)}$.

Proof. The sum of independent uniformly distributed random variables in
$\mathbb{Z}_d$ is again uniformly distributed. Moreover, since $d$ is prime, this
distribution is invariant under multiplication by non-zero factors. Hence if
$X_j \in \mathbb{Z}_d$ ($j = 1, \ldots, N$) are independent and uniformly
distributed, and $\phi_j \in \mathbb{Z}_d$ are non-random constants, not all of
which are zero, $\sum_{j=1}^{N} X_j \phi_j$ is uniformly distributed. Hence, for
a fixed vector $\phi \ne 0$, the $N$ components
$(X\phi)_k = \sum_{j=1}^{M} X_{kj}\phi_j$ are independent uniformly distributed
random variables. Hence the probability for $X\phi = 0$ for some fixed
$\phi \ne 0$ is $d^{-N}$. Since there are $d^M - 1$ vectors $\phi$ to be tested,
the probability for some $\phi$ to yield $X\phi = 0$ is at most $d^{M-N}$. □
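
For very small matrices the estimate of the Lemma can be verified by exhaustive
enumeration. The Python sketch below is an added illustration with hypothetical
function names. As in the proof, a matrix is counted as singular when
$X\phi = 0$ has a non-zero solution $\phi$. The brute-force loops are only
feasible for tiny values of $d$, $N$ and $M$.

```python
from itertools import product


def has_nontrivial_kernel(rows, d):
    """True if some non-zero phi in Z_d^M satisfies X phi = 0 (mod d)."""
    n_cols = len(rows[0])
    for phi in product(range(d), repeat=n_cols):
        if not any(phi):
            continue  # skip the zero vector
        if all(sum(x * p for x, p in zip(row, phi)) % d == 0 for row in rows):
            return True
    return False


def singular_fraction(d, n_rows, n_cols):
    """Exact probability that a uniform n_rows x n_cols matrix over Z_d is singular."""
    singular = 0
    total = 0
    for entries in product(range(d), repeat=n_rows * n_cols):
        rows = [entries[i * n_cols:(i + 1) * n_cols] for i in range(n_rows)]
        total += 1
        singular += has_nontrivial_kernel(rows, d)
    return singular / total


if __name__ == "__main__":
    for d, n_rows, n_cols in [(2, 3, 2), (2, 4, 2), (3, 3, 2)]:
        print(d, n_rows, n_cols, singular_fraction(d, n_rows, n_cols),
              d ** (n_cols - n_rows))
```

In each printed row the exact singular fraction can be compared with the bound
$d^{M-N}$ in the last column.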



Proposition 7.2 Let $d$ be a prime, and let $\Gamma$ be a symmetric
$(n+m) \times (n+m)$-matrix with entries in $\mathbb{Z}_d$, chosen at random such
that $\Gamma_{kk} = 0$ and that the $\Gamma_{kj}$ with $k > j$ are independent and
uniformly distributed. Let $P$ be the probability for the corresponding graph
code not to correct $f$ errors (with $2f < n$). Then

\[ \frac{1}{n} \log_2 P \le \left( \frac{m}{n} + \frac{4f}{n} - 1 \right) \log_2 d + H_2\!\left( \frac{2f}{n} \right). \tag{38} \]

Proof. Each error configuration is a $2f$-element subset of the $n$ output nodes.
According to Proposition ... we have to decide whether the corresponding
$(n - 2f) \times (m + 2f)$-submatrix of $\Gamma$, connecting input and error
positions with the remaining output positions, is singular or not. Since this
submatrix contains no pairs $\Gamma_{ij}$, $\Gamma_{ji}$, its entries are
independent and satisfy the conditions of the previous Lemma. Hence the
probability that a particular configuration of errors goes uncorrected is at
most $d^{(m+2f)-(n-2f)}$. Since there are $\binom{n}{2f}$ possible error
configurations among the outputs, we can estimate the probability of any
$2f$-site error configuration to be undetected as less than
$\binom{n}{2f}\, d^{m-n+4f}$. Using Lemma 6.2 we can estimate the binomial as
$\log_2 \binom{n}{2f} \le n H_2(2f/n)$, which leads to the bound stated. □



In particular, if the right hand side of the inequality in (38) is negative, we
get $P < 1$, so that there must be at least one matrix $\Gamma$ correcting $f$
errors. The crucial point is that this observation does not depend on $n$, but
only on the rate-like parameters $m/n$ and $f/n$. Let us make this behaviour a
Definition:

Definition 7.3 Let $d$ be an integer. Then we say a pair $(\mu, \varepsilon)$
consisting of a coding rate $\mu$ and an error rate $\varepsilon$ is achievable,
if for every $n$ we can find an encoding $E$ of $\lfloor \mu n \rfloor$ $d$-level
systems into $n$ $d$-level systems correcting $\lfloor \varepsilon n \rfloor$
errors.

Then we can paraphrase the last proposition as saying that all pairs
$(\mu, \varepsilon)$ with

\[ (1 - \mu - 4\varepsilon) \log_2 d > H_2(2\varepsilon) \tag{39} \]

are achievable. This is all the input we need for the next section, although a
better coding scheme, giving larger $\mu$ or larger $\varepsilon$, would also
improve the rate estimates proved there. Such improvements are indeed possible.
E.g. for the qubit case ($d = 2$) it is shown in [?] that there is always a code
which saturates the quantum Gilbert-Varshamov bound
$(1 - \mu - 2\varepsilon \log_2 3) \ge H_2(2\varepsilon)$, which is slightly
better than our result.
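
The achievability condition is easy to explore numerically. The following Python
sketch is an added illustration; the function names are hypothetical. It
evaluates the supremum $1 - 4\varepsilon - H_2(2\varepsilon)/\log_2 d$ of the
coding rates $\mu$ allowed by the condition $(1 - \mu - 4\varepsilon)\log_2 d >
H_2(2\varepsilon)$, and locates by bisection the error rate at which this
supremum drops to zero. The bisection assumes that the rate is decreasing in
$\varepsilon$ on $[0, 1/4]$, which holds because $H_2$ is increasing on
$[0, 1/2]$.

```python
from math import log2


def binary_entropy(x: float) -> float:
    """Shannon entropy H_2(x) of the distribution (x, 1 - x), in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)


def random_graph_rate(eps: float, d: int = 2) -> float:
    """Supremum of the rates mu allowed for error rate eps: 1 - 4 eps - H_2(2 eps)/log2(d)."""
    return 1 - 4 * eps - binary_entropy(2 * eps) / log2(d)


def is_achievable(mu: float, eps: float, d: int = 2) -> bool:
    """Check the sufficient condition (1 - mu - 4 eps) log2(d) > H_2(2 eps)."""
    return (1 - mu - 4 * eps) * log2(d) > binary_entropy(2 * eps)


def max_error_rate(d: int = 2, tol: float = 1e-12) -> float:
    """Bisection on [0, 1/4] for the error rate where random_graph_rate hits zero."""
    lo, hi = 0.0, 0.25
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if random_graph_rate(mid, d) > 0:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    for d in (2, 3, 5):
        print(d, max_error_rate(d))
```

The condition is only sufficient, so pairs outside this region are not excluded
by it.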

But there are also known limitations, particularly the so-called Hamming
bound. This is a simple dimension counting argument, based on the error
corrector's dream: Assuming that the scalar product
$(F, G) \mapsto \omega(F^* G)$ on the error space $\mathcal{E}$ is
non-degenerate, the dimension of the "bad space" is the same as the dimension of
the error space. Hence, in the notation introduced with the error corrector's
dream, we expect
$\dim \mathcal{H}_0 \cdot \dim \mathcal{E} \le \dim \mathcal{H}_2$. We now take
$m$ input systems and $n$ output systems of dimension $d$ each, so that
$\dim \mathcal{H}_1 = d^m$ and $\dim \mathcal{H}_2 = d^n$. For the space of
errors happening at no more than $f$ places we introduce a basis as follows: at
each site we choose a basis of $\mathcal{B}(\mathcal{H})$ consisting of
$d^2 - 1$ operators plus the identity. Then a basis of $\mathcal{E}$ is given by
all tensor products with basis elements $\ne 1$ placed at $j \le f$ sites. Hence
$\dim \mathcal{E} = \sum_{j \le f} \binom{n}{j} (d^2 - 1)^j$. For large $n$ we
estimate this as in Lemma 6.2 as
$\frac{1}{n}\log_2 \dim \mathcal{E} \approx (f/n) \log_2(d^2 - 1) + H_2(f/n)$.
Hence the Hamming bound becomes (with $\varepsilon = f/n$)

\[ \frac{m}{n} \log_2 d + H_2(\varepsilon) + \frac{f}{n} \log_2(d^2 - 1) \le \log_2 d, \tag{40} \]

which (with $d^2 \gg 1$) is just (39) with a factor 1/2 on all errors.

Figure 5: Singleton bound and Hamming bound together with the rate achieved
by random graph coding (for $d = 2$). The allowed regions are below the
respective curve.

If we drop the nondegeneracy condition made above it is possible to find
codes which break the Hamming bound. In this case, however, we can consider
the weaker Singleton bound, which has to be respected by those degenerate
codes as well. It reads

\[ 1 - \frac{m}{n} \ge \frac{4f}{n}. \tag{41} \]

We omit its proof here (see [13], Sect. 12.4 instead). Both bounds are plotted
together with the rate achieved by random graph coding in Figure 5 (for
$d = 2$).
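
The three curves being compared can be tabulated directly. The Python sketch
below is an added illustration with hypothetical function names. It writes each
bound as a rate $\mu = m/n$ depending on the error rate $\varepsilon = f/n$: the
Hamming bound in its large-$n$ form
$\mu \le 1 - (H_2(\varepsilon) + \varepsilon \log_2(d^2 - 1))/\log_2 d$, the
Singleton bound $\mu \le 1 - 4\varepsilon$, and the random graph coding rate
$1 - 4\varepsilon - H_2(2\varepsilon)/\log_2 d$.

```python
from math import log2


def binary_entropy(x: float) -> float:
    """Shannon entropy H_2(x) of the distribution (x, 1 - x), in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)


def hamming_rate(eps: float, d: int = 2) -> float:
    """Upper bound on mu from the Hamming bound (non-degenerate codes only)."""
    return 1 - (binary_entropy(eps) + eps * log2(d * d - 1)) / log2(d)


def singleton_rate(eps: float) -> float:
    """Upper bound on mu from the Singleton bound."""
    return 1 - 4 * eps


def random_graph_rate(eps: float, d: int = 2) -> float:
    """Rate shown to be achievable by random graph coding."""
    return 1 - 4 * eps - binary_entropy(2 * eps) / log2(d)


if __name__ == "__main__":
    print("eps  random-graph  Hamming  Singleton")
    for i in range(0, 26, 5):
        eps = i / 100
        print(eps, random_graph_rate(eps), hamming_rate(eps), singleton_rate(eps))
```

Negative entries simply mean that the corresponding expression gives no positive
rate at that error rate.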
8 Conclusions



We are now ready to combine our earlier discussion of channel capacity
with the results about error correction we have derived in the previous sections.
Please note that most of the results presented here can be found in [?], [?], in
some cases with better bounds.



8.1 Correcting small errors 

We first look at the problem which motivated our study, namely estimating the
capacity of a channel $T \approx \mathrm{Id}$.



Theorem 8.1 Let $d$ be a prime, and let $T$ be a channel on $d$-level systems.
Suppose that for some $0 < \varepsilon < 1/2$,

\[ \|\mathrm{Id} - T\|_{cb} < 2^{-H_2(\varepsilon)/\varepsilon}. \tag{42} \]

Then

\[ Q(T) \ge (1 - 4\varepsilon) \log_2(d) - H_2(2\varepsilon). \tag{43} \]

Proof. For every $n$ set $f = \lfloor \varepsilon n \rfloor$, and
$m = \lfloor \mu n \rfloor - 1$, where $\mu$ is, up to a $\log_2(d)$ factor, the
right hand side of (43), i.e.
$\mu = 1 - 4\varepsilon - \log_2(d)^{-1} H_2(2\varepsilon)$. This ensures that
the right hand side of (38) is strictly negative, so there must be a code for
$d$-level systems, with $m$ inputs and $n$ outputs, and correcting $f$ errors.
To this code we apply Proposition 6.1, and insert the bound on
$\|\mathrm{Id} - T\|_{cb}$ into Equation (34). Thus
$\Delta(T^{\otimes n}, d^{\lfloor \mu n \rfloor - 1}) \to 0$, even exponentially.
This means that any number $< \mu \log_2(d)$ is an achievable rate. In other
words, $\mu \log_2(d)$ is a lower bound to the capacity. □



If $\varepsilon > 0$ is small enough the quantity on the right hand side of
Equation (43) is strictly positive (cf. the dotted graph in Figure 5). Hence each
channel which is sufficiently close to the identity allows (asymptotically)
perfect error correction. Beyond that we see immediately that $Q(T)$ is
continuous (in the cb-norm) at $T = \mathrm{Id}$: Since $Q(T)$ is at most
$\log_2(d)$ and the right hand side $g(\varepsilon)$ of (43) is continuous in
$\varepsilon$ with $g(0) = \log_2(d)$, we find that for each $\delta > 0$ an
$\varepsilon > 0$ exists such that $\log_2(d) - Q(T) < \delta$ for all $T$ with
$\|T - \mathrm{Id}\|_{cb} < \varepsilon/e$. In other words, if $T$ is
arbitrarily close to the identity its capacity is arbitrarily close to
$\log_2(d)$. In Corollary 8.3 below we will show the significantly stronger
statement that $Q$ is a lower semicontinuous function on the set of all channels.
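
As a worked illustration of how the theorem is used, the following Python sketch
turns a given cb-norm distance into a capacity estimate. It is an added example
with a hypothetical function name. It makes the particular choice
$\varepsilon = e\,\|\mathrm{Id} - T\|_{cb}$, which satisfies the hypothesis
because $\varepsilon/e < 2^{-H_2(\varepsilon)/\varepsilon}$; other admissible
choices of $\varepsilon$ may give a better value. The dimension $d$ is assumed
to be prime, as in the theorem.

```python
from math import e, log2


def binary_entropy(x: float) -> float:
    """Shannon entropy H_2(x) of the distribution (x, 1 - x), in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)


def capacity_lower_bound(cb_distance: float, d: int) -> float:
    """Evaluate (1 - 4 eps) log2(d) - H_2(2 eps) at eps = e * cb_distance.

    cb_distance stands for ||Id - T||_cb. The value 0 is treated as the
    limiting case T = Id. A negative result carries no information, since
    the capacity is non-negative anyway.
    """
    eps = e * cb_distance
    if not 0 <= eps < 0.5:
        raise ValueError("need 0 <= e * cb_distance < 1/2")
    return (1 - 4 * eps) * log2(d) - binary_entropy(2 * eps)


if __name__ == "__main__":
    for distance in (1e-4, 1e-3, 1e-2):
        print(distance, capacity_lower_bound(distance, 2))
```

The printed values approach $\log_2(d) = 1$ as the distance shrinks, in line
with the continuity of $Q$ at the identity.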



8.2 Estimating capacity from finite coding solutions 

A crucial consequence of the ability to correct small errors is that we do not
actually have to compute the limit defining the capacity: if we have a pretty
good coding scheme for a given channel, i.e., one that gives us
$E T^{\otimes n} D \approx \mathrm{Id}_d$, then we know the errors can actually
be brought to zero, and the capacity is close to the nominal rate of this
scheme, namely $\log_2(d)/n$.

Theorem 8.2 Let $T$ be a channel, not necessarily between systems of the same
dimension. Let $k, p \in \mathbb{N}$ with $p$ a prime number, and suppose there
are channels $E$ and $D$ encoding and decoding a $p$-level system through $k$
parallel uses of $T$, with error
$\Delta = \|\mathrm{Id}_p - E T^{\otimes k} D\|_{cb} < \frac{1}{2e}$. Then

\[ Q(T) \ge \frac{\log_2 p}{k} (1 - 4e\Delta) - \frac{1}{k} H_2(2e\Delta). \tag{44} \]

Moreover, $Q(T)$ is the least upper bound on all expressions of this form.



Proof. We apply Theorem 8.1 to the channel $\tilde{T} = E T^{\otimes k} D$. With
the random coding method we thus find a family of coding and decoding channels
$\tilde{E}$ and $\tilde{D}$ from $m'$ into $n'$ systems, of $p$ levels each, such
that

\[ \|\mathrm{Id} - \tilde{E} (E T^{\otimes k} D)^{\otimes n'} \tilde{D}\|_{cb} \to 0. \tag{45} \]

This can be reinterpreted as an encoding of $p^{m'}$-dimensional systems through
$kn'$ uses of the channel $T$ (rather than $\tilde{T}$), which corresponds to a
rate $(kn')^{-1} \log_2(p^{m'}) = (\log_2 p / k)(m'/n')$. We now argue exactly as
in the proof of the previous theorem, with $\varepsilon = e\Delta$, so that

\[ \|\mathrm{Id}_p - E T^{\otimes k} D\|_{cb} = \varepsilon/e < 2^{-H_2(\varepsilon)/\varepsilon} \tag{46} \]

by equation (37). By random graph coding we can achieve the coding ratio
$\mu \approx (m'/n') = 1 - 4\varepsilon - \log_2(p)^{-1} H_2(2\varepsilon)$, and
have the errors $\Delta(\tilde{T}^{\otimes n'}, p^{m'})$ go to zero
exponentially. Since

\[ \Delta(T^{\otimes kn'}, p^{m'}) \le \Delta(\tilde{T}^{\otimes n'}, p^{m'}) \le \|\mathrm{Id} - \tilde{E} (E T^{\otimes k} D)^{\otimes n'} \tilde{D}\|_{cb}, \tag{47} \]

we can apply Lemma 3.2 to the channel $T$ (where the sequence $n_\alpha$ is given
by $n_\alpha = k\alpha$) and find that the rate $\mu (\log_2 p / k)$ is
achievable. This yields the estimate claimed in Equation (44).

To prove the second statement consider the function $x \mapsto p(x)$ which
associates to each real number $x \ge 2$ the biggest prime $p(x)$ with
$p(x) \le x$. From known bounds on the length of gaps between two consecutive
primes (see footnote 1) it follows that $\lim_{x \to \infty} x/p(x) = 1$ holds,
hence we get $2^{kc}/p(2^{kc}) \le 1 + \delta'$ for an arbitrary $\delta' > 0$,
provided $k$ is large enough, but this implies

\[ c - \frac{\log_2[p(2^{kc})]}{k} \le \frac{\log_2(1 + \delta')}{k}. \tag{48} \]

Since we can choose an achievable rate $c$ arbitrarily close to the capacity
$Q(T)$ this shows that there is for each $\delta > 0$ a prime $p$ and a positive
integer $k$ such that $|Q(T) - \log_2(p)/k| < \delta$. In addition we can find a
coding scheme $E$, $D$ for $T^{\otimes k}$ whose error $\Delta$ is arbitrarily
small, i.e. the right hand side of (44) can be arbitrarily close to
$\log_2(p)/k$, and this completes the proof. □
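
The role of the function $p(x)$ can be made concrete with a few lines of Python.
The sketch below is an added illustration with hypothetical function names. It
uses plain trial division, which is only meant for small arguments, to find the
biggest prime not exceeding $2^{kc}$, and reports how far the nominal rate
$\log_2[p(2^{kc})]/k$ falls below the target rate $c$.

```python
from math import log2


def is_prime(n: int) -> bool:
    """Trial division primality test, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def largest_prime_at_most(x: float) -> int:
    """The biggest prime p with p <= x, for x >= 2."""
    if x < 2:
        raise ValueError("need x >= 2")
    n = int(x)
    while not is_prime(n):
        n -= 1
    return n


def nominal_rate(k: int, c: float) -> float:
    """log2(p(2**(k*c))) / k, the rate of a p-level system sent through k uses."""
    return log2(largest_prime_at_most(2 ** (k * c))) / k


if __name__ == "__main__":
    c = 0.7  # illustrative target rate
    for k in (5, 10, 15, 20):
        print(k, nominal_rate(k, c), c - nominal_rate(k, c))
```

The printed gap $c - \log_2[p(2^{kc})]/k$ is non-negative by construction and
shrinks as $k$ grows, which is what the argument with prime gaps guarantees in
general.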

This theorem allows us to derive very easily an important continuity property
of the quantum capacity. It is well known that each function $F$ (on a
topological space) which is given as the supremum of a set of real-valued,
continuous functions is lower semicontinuous, i.e. the set
$F^{-1}((x, \infty])$ is open for each $x \in \mathbb{R}$. Since the right hand
side of Equation (44) is continuous in $T$ and since $Q(T)$ is (according to
Theorem 8.2) the supremum over such quantities, we get:

Corollary 8.3 $T \mapsto Q(T)$ is lower semi-continuous in cb-norm.

1 If $p_n$ denotes the $n$-th prime and $g(p_n) = p_{n+1} - p_n$ is the length of
the gap between $p_n$ and $p_{n+1}$, it is shown in [8] that $g(p)$ is bounded by
$\mathrm{const}\; p^{5/8+\epsilon}$.

8.3 Error exponents



Another consequence of Theorem 8.2 concerns the rate with which the error
$\Delta(T^{\otimes n}, 2^{\lfloor cn \rfloor})$ decays in the limit
$n \to \infty$. Theorem 8.2 says, roughly speaking, that we can achieve each rate
$c < Q(T)$ by combining a coding scheme $E$, $D$ with subsequent random-graph
coding $\tilde{E}$, $\tilde{D}$. However, the error
$\Delta((E T^{\otimes k} D)^{\otimes n'}, p^{m'})$ decays according to (34) and
Proposition 7.2 exponentially. A more precise analysis of this idea leads to the
following (cf. also the work of Hamada [7]):

Proposition 8.4 If $T$ is a channel with quantum capacity $Q(T)$ and $c < Q(T)$,
then, for sufficiently large $n$ we have

\[ \Delta(T^{\otimes n}, 2^{\lfloor cn \rfloor}) \le e^{-n\lambda(c)}, \tag{49} \]

with a positive constant $\lambda(c)$.

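Proof. We start as in Theorem 8.2 with the channel
$\tilde{T} = E T^{\otimes k} D$ and the quantity
$\Delta = \|\mathrm{Id}_p - E T^{\otimes k} D\|_{cb}$. However, instead of
assuming that $\Delta = \varepsilon/e$ holds, the full range
$e\Delta \le \varepsilon \le 1/2$ is allowed for the error rate $\varepsilon$.
Using the same arguments as in the proof of Theorem 8.2 we get an achievable rate

\[ c(k, p, \varepsilon) = \frac{\log_2(p)}{k} \left( 1 - 4\varepsilon - \frac{H_2(2\varepsilon)}{\log_2(p)} \right) \tag{50} \]

and an exponential bound on the coding error:

\[ \Delta(T^{\otimes kn'}, p^{m'}) \le \|\mathrm{Id} - \tilde{E} (E T^{\otimes k} D)^{\otimes n'} \tilde{D}\|_{cb} \le \left( 2^{H_2(\varepsilon)} \Delta^{\varepsilon} \right)^{n'}, \tag{51} \]

cf. the corresponding estimates above.

To calculate the exponential rate $\lambda(c)$ with which the coding error
vanishes we have to consider the quantity

\[ \lambda(c) = \liminf_{n \to \infty} -\frac{1}{n} \ln \Delta(T^{\otimes n}, \lfloor 2^{nc} \rfloor) \ge \lim_{n' \to \infty} -\frac{1}{kn'} \ln \left( 2^{H_2(\varepsilon)} \Delta^{\varepsilon} \right)^{n'} \tag{52} \]

\[ \ge -\frac{\varepsilon}{k} \left( \frac{H_2(\varepsilon)}{\varepsilon} \ln 2 + \ln \Delta \right) = -\varepsilon\, \Lambda(\Delta, \varepsilon)/k, \tag{53} \]

where we have inserted inequality (51). Now we can apply Lemma 3.2 (with
the sequence $n_\alpha = k\alpha$), which shows that $\lambda(c)$ is positive, if
the right hand side of (53) is.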

What remains to show is that $\lambda(c) > 0$ holds for each $c < Q(T)$. To this
end we have to choose $k$, $p$, $\Delta$ and $\varepsilon$ such that
$c(k, p, \varepsilon) = c$ and $\Lambda(\Delta, \varepsilon) < 0$. Hence consider
$\delta > 0$ such that $c + \delta < Q(T)$ is an achievable rate. As in the proof
of Theorem 8.2 we can choose $\log_2(p)/k$ such that $\log_2(p)/k > c + \delta$
holds while $\Delta$ is arbitrarily small. Hence there is an $\varepsilon_0 > 0$
such that $c(k, p, \varepsilon) = c$ implies $\varepsilon > \varepsilon_0$. The
statement therefore follows from the fact that there is a $\Delta_0 > 0$ with
$\Lambda(\Delta, \varepsilon) < 0$ for all $0 < \Delta < \Delta_0$ and
$\varepsilon > \varepsilon_0$. □



In addition to the statement of Proposition 8.4 we have just derived a lower
bound on the error exponent $\lambda(c)$. Since we cannot express the error rate
$\varepsilon$ as a function of $k$, $p$ and $c$ we cannot specify this bound
explicitly. However, we can plot it as a parametrized curve (using Equations (50)
and (53) with $\varepsilon$ as the parameter) in the $(c, \lambda)$-space. In
Figure 6 this is done for $k = 1$, $p = 2$ and several values of $\Delta$.
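
Such a parametrized curve can be generated as follows. The Python sketch is an
added illustration with hypothetical function names, and it rests on one
assumption about the formulas: the exponent bound is taken to be
$-(H_2(\varepsilon) \ln 2 + \varepsilon \ln \Delta)/k$, which is what results
from taking $-\frac{1}{kn'} \ln$ of the bound
$(2^{H_2(\varepsilon)} \Delta^{\varepsilon})^{n'}$ on the coding error. The rate
is $c(k, p, \varepsilon) = (\log_2 p / k)(1 - 4\varepsilon -
H_2(2\varepsilon)/\log_2 p)$, and $\varepsilon$ runs over
$e\Delta \le \varepsilon \le 1/2$; only points with positive rate are kept.

```python
from math import e, log, log2


def binary_entropy(x: float) -> float:
    """Shannon entropy H_2(x) of the distribution (x, 1 - x), in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)


def rate(k: int, p: int, eps: float) -> float:
    """Achievable rate c(k, p, eps)."""
    return (log2(p) / k) * (1 - 4 * eps - binary_entropy(2 * eps) / log2(p))


def exponent_bound(k: int, delta: float, eps: float) -> float:
    """Assumed lower bound on the error exponent: -(H_2(eps) ln 2 + eps ln delta) / k."""
    return -(binary_entropy(eps) * log(2) + eps * log(delta)) / k


def exponent_curve(k: int, p: int, delta: float, steps: int = 40):
    """Points (c, bound) for eps between e * delta and 1/2 with positive rate."""
    lo, hi = e * delta, 0.5
    points = []
    for i in range(steps + 1):
        eps = lo + i * (hi - lo) / steps
        c = rate(k, p, eps)
        if c > 0:
            points.append((c, exponent_bound(k, delta, eps)))
    return points


if __name__ == "__main__":
    for delta in (1e-2, 1e-3, 1e-4):
        print(delta, exponent_curve(1, 2, delta, steps=10))
```

Along each curve a larger $\varepsilon$ lowers the rate and raises the exponent
bound, which is the trade-off such a plot displays.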
Figure 6: Lower bounds on the error exponent $\lambda(c)$ plotted for $k = 1$,
$p = 2$ and different values of $\Delta$.



8.4 Capacity with finite error allowed 

We can also tolerate finite errors in encoding. Let $Q_\varepsilon(T)$ denote the
quantity defined exactly like the capacity, but with the weaker requirement that
$\Delta(T^{\otimes n}, 2^{\lfloor cn \rfloor}) \le \varepsilon$ for large $n$.
Obviously we have $Q_\varepsilon(T) \ge Q(T)$ for each $\varepsilon > 0$.
Regarded as a function of $\varepsilon$ and $T$ this new quantity admits in
addition the following continuity property in $\varepsilon$.

Proposition 8.5 $\lim_{\varepsilon \to 0} Q_\varepsilon(T) = Q(T)$.

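Proof. By definition we can find for each $\varepsilon', \delta > 0$ a tuple
$n$, $p$, $E$ and $D$ such that

\[ \|\mathrm{Id}_p - E T^{\otimes n} D\|_{cb} = \frac{\varepsilon + \varepsilon'}{e} \tag{54} \]

and $|Q_\varepsilon(T) - \log_2(p)/n| < \delta$ holds. If
$\varepsilon + \varepsilon'$ is small enough, however, we find as in Theorem 8.2
a random graph coding scheme such that

\[ Q(T) \ge \frac{\log_2(p)}{n} \bigl( 1 - 4(\varepsilon + \varepsilon') \bigr) - \frac{1}{n} H_2\bigl( 2(\varepsilon + \varepsilon') \bigr) = g(\varepsilon + \varepsilon'). \tag{55} \]

Hence the statement follows from continuity of $g$ and the fact that
$g(0) = \log_2(p)/n$ holds. □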

For a classical channel $\Phi$ even more is known about the similarly defined
quantity $C_\varepsilon(\Phi)$: If $\varepsilon > 0$ is small enough we cannot
achieve bigger rates by allowing small errors, i.e.
$C(\Phi) = C_\varepsilon(\Phi)$. This is called the "strong converse of Shannon's
noisy channel coding theorem" [17]. To check whether a similar statement holds in
the quantum case is one of the big open problems of the theory.

Acknowledgements 

Funding by the European Union project EQUIP (contract IST-1999-11053) and
financial support from the DFG (Bonn) is gratefully acknowledged.

References 

[1] A. Ashikhmin and E. Knill. Nonbinary quantum stabilizer codes. IEEE T.
Inf. Theory 47, no. 7, 3065-3072 (2001).

[2] A. R. Calderbank, E. M. Rains, P. W. Shor and N. J. A. Sloane. Quantum
error correction and orthogonal geometry. Phys. Rev. Lett. 78, no. 3,
405-408 (1997).

[3] A. R. Calderbank and P. W. Shor. Good quantum error-correcting codes
exist. Phys. Rev. A 54, 1098-1105 (1996).

[4] D. P. DiVincenzo, P. W. Shor and J. A. Smolin. Quantum-channel capacity
of very noisy channels. Phys. Rev. A 57, no. 2, 830-839 (1998). Erratum:
Phys. Rev. A 59, 2, 1717 (1999).

[5] D. Gottesman. Class of quantum error-correcting codes saturating the
quantum Hamming bound. Phys. Rev. A 54, 1862-1868 (1996).

[6] D. Gottesman. Stabilizer codes and quantum error correction. Ph.D. thesis,
California Institute of Technology (1997). quant-ph/9705052.

[7] M. Hamada. Exponential lower bound on the highest fidelity achievable by
quantum error-correcting codes. quant-ph/0109114 (2001).

[8] A. E. Ingham. On the difference between consecutive primes. Quart. J.
Math., Oxford Ser. 8, 255-266 (1937).

[9] M. Keyl. Fundamentals of quantum information theory. quant-ph/0202122
(2001).

[10] E. Knill and R. Laflamme. Theory of quantum error-correcting codes. Phys.
Rev. A 55, no. 2, 900-911 (1997).

[11] K. Kraus. States effects and operations. Springer, Berlin (1983).

[12] R. Matsumoto and T. Uyematsu. Lower bound for the quantum capacity
of a discrete memoryless quantum channel. quant-ph/0105151 (2001).

[13] M. A. Nielsen and I. L. Chuang. Quantum computation and quantum
information. Cambridge University Press, Cambridge (2000).

[14] V. I. Paulsen. Completely bounded maps and dilations. Longman Scientific
& Technical (1986).

[15] J. Preskill. Lecture notes for the course 'information for physics
219/computer science 219, quantum computation'. Caltech, Pasadena, California
(1999). www.theory.caltech.edu/people/preskill/ph229.

[16] D. Schlingemann and R. F. Werner. Quantum error-correcting codes
associated with graphs. quant-ph/0012111 (2000).

[17] C. E. Shannon. A mathematical theory of communication. Bell. Sys. Tech.
J. 27, 379-423, 623-656 (1948).

[18] P. W. Shor. Algorithms for quantum computation: Discrete logarithms and
factoring. In Proc. of the 35th Annual Symposium on the Foundations of
Computer Science (S. Goldwasser, editor), pages 124-134. IEEE Computer
Science, Society Press, Los Alamitos, California (1994).

[19] P. W. Shor. Polynomial-time algorithms for prime factorization and
discrete logarithms on a quantum computer. Soc. Ind. Appl. Math. J. Comp.
26, 1484-1509 (1997).

[20] A. M. Steane. Multiple particle interference and quantum error correction.
Proc. Roy. Soc. Lond. A 452, 2551-2577 (1996).

[21] W. F. Stinespring. Positive functions on C*-algebras. Proc. Amer. Math.
Soc. pages 211-216 (1955).

[22] R. F. Werner. Quantum information theory - an invitation. In Quantum
information (G. Alber et. al., editor), pages 14-59. Springer (2001).

[23] W. K. Wootters and W. H. Zurek. A single quantum cannot be cloned.
Nature 299, 802-803 (1982).





